perm filename OLDANL.MSS[RDG,DBL] blob sn#687850 filedate 1982-11-28 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00010 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	@Chapter[Introduction]
C00008 00003	@Chapter[Just what @i{is} an Analogy?]
C00010 00004	@Section[Similarity]
C00017 00005	@SECTION[Analogy - Informal]
C00022 00006	@SubHeading[Types of Analogy Tasks]
C00027 00007	@COMMENT{ Everything from this Page 8 upto Page 11 was removed to
C00038 00008	@BEGIN[Comment]
C00040 00009	@BEGIN[Comment] 	Outtakes
C00043 00010	@BEGIN[Comment]			Outtakes - from NewAnl
C00058 ENDMK
C⊗;
@Chapter[Introduction]
@LABEL{Intro}

The purpose of this paper is to motivate and present a 
compresensive definition of the elusive term, @i{analogy}.
Much of its emphasis, and many of its biases, can be credited to
our overall goal -- designing a general "Analogizing" program.
(See @Cite[ThesisProp] for more details.)

Chapter @Ref[QuickDefn] begins with a coarse "naive" description of analogy;
specifying the types of situations which 
(for our purposes)
do, or do not, qualify as an analogy.
Chapter @Ref[Examples] then lists a small body of quick, "local" examples;
whose breadth is (or at least seems) sufficient to span
the significant different cases,
covering both formal and informal situations.

Several proposals for a "comprehesive" definition of analogy
follow this overhead.
After discussing the positive aspects of each such "straw man",
we demonstrate its inherent limitations
by listing some "obvious, standard" analogy problems
(taken from Chapter @Ref[Examples]) 
which cannot be handled by this formalism.

Taken together, these cases motivate the 
"correct"
view of analogy, presented in Chapter @Ref[Answer].
This chapter will explain why (we feel) this model "works"
-- that is, can be used to handle a wide range of
(perhaps all?) 
analogy problems/situations.  

The next chapter, @Ref[Issues], discusses some of the remaining issues.
These are points, both minor and major, which must be resolved before
this description of @i{analogy} can be considered complete --
that is, sufficiently detailed to actually implement.

----

<<here - write this after rest of paper>>
examines at this particular definition of analogy in depth;
commenting first on its advantanges and generality, and then discussing some of
its problems -- especially those which make it difficult to actually implement.

?? Various appendices will cover additional things -- such as expected properties
of analogies (to be exhibited by any model/implementation...) and ...

@Chapter[Just what @i{is} an Analogy?]
@Label{QuickDefn}

To ground this discussion on analogy, it is essential to specify 
(at least superficially)
just what we will mean (and, importantly, do not mean) by the term analogy.
We will begin by formally defining a weaker notion, of similarity.
While this relation is NOT our objective,
It does serve to start our discussion of analogy.
After hinting at how similarity is related to analogy,
and then indicating why it is not sufficient,
we will present a quick sketch of an analogy.
That overview will help the reader focus on the "kinds" of analogies
this paper is attempting to explain, and on the types of
"analogizing tasks" it is considering.

@Section[Similarity]
@Label{Similar}

Our intuitions tell us that two things can only be analogous
if they are (somehow) sufficiently similar.
But what does it mean to be similar?
A natural notion of similarity (which we will use) defines
two objects to be similar @i{iff} there is a set of features which they share.
For example, we could claim that RDG and MRG are similar in that
each is a person, lives in Palo Alto, is associated with Stanford, etc.

@SubHeading[Formal]

We can formalize this by making @i{SIMILAR} a tertiary relation,
whose three arguments are all <Symbol, Theory> pairs.
The expression
@BEGIN[Example]
@i{SIMILAR}( <A@-{1}, TH@-{1}>, <A@-{2}, TH@-{2}>, <UNIF, COMMON> ),
@END[Example]
will mean that the analogue A@-{1}, described by the theory TH@-{1}, is similar
to A@-{2}, described by the theory TH@-{2}, with respect to the
theory COMMON.@Foot{
Actually COMMON might be merely a list of axioms -- 
it need not be their deductive closure under @i[modus monens].}
This statement is true if each statement in the theory COMMON
maps into a statement of TH@-{1} (resp., TH@-{2}) using mapping 
@g(s)@-{1} = @K[LeftDoubleBracket]A@-{1}\UNIF@K[RightDoubleBracket]
(resp., @g(s)@-{2} = @K[LeftDoubleBracket]A@-{2}\UNIF@K[RightDoubleBracket]).
Stated another way,

@BEGIN[Example]
COMMON @K[SUBSET] TH@-{1}@K[LeftDoubleBracket]A@-{1}\UNIF@K[RightDoubleBracket] @~
@K[Intersection] TH@-{2}@K[LeftDoubleBracket]A@-{2}\UNIF@K[RightDoubleBracket].
@END[Example]

Intuitively, COMMON contains those facts which are included in both theories,
modulo the actual names of the analogues.
Each TH@-{i} theory might house
additional facts about that A@-{i} analogue,
facts which are not included in COMMON.
Note finally that each A@-{i} satisfies COMMON, in a model-theoritic sense.

@SubHeading[Example]

Continuing the MRG/RDG example began above,
suppose RDG was A@-{1}, and MRG = A@-{2};
and TH@-{1} included
@BEGIN[EXAMPLE]
(MEM RDG PERSONS)
(LIVES-IN RDG PALO-ALTO)
(ASSOCIATES STANFORD RDG)
(LAST-INITIAL RDG G)
(AGE RDG 27)
(PREFERED-RECORDER RDG ALTO)
@END[EXAMPLE]

and TH@-{2} included
@BEGIN[EXAMPLE]
(MEM MRG PERSONS)
(LIVES-IN MRG PALO-ALTO)
(ASSOCIATES STANFORD MRG)
(LAST-INITIAL MRG G)
(AGE MRG 31)
(PREFERED-RECORDER MRG BASS)
@END[EXAMPLE]

Then we could construct a COMMON theory of
@BEGIN[EXAMPLE]
(MEM ? PERSONS)
(LIVES-IN ? PALO-ALTO)
(ASSOCIATES STANFORD ?)
@END[EXAMPLE]

It should be clear that
@BEGIN[EXAMPLE]
@i{SIMILAR}( <RDG, TH@-{1}>, <MRG, TH@-{2}>, <?, COMMON> ).
@END[EXAMPLE]
Notice that COMMON could have included 
@BEGIN[EXAMPLE]
(LAST-INITIAL ? G),
@END[EXAMPLE]
but didn't.  
Also, note the analogue('s name) was allowed to occur in any position of the
proposition.

@SubHeading[Critique]

There are several aspects related to this @i[SIMILAR]ity relation
which should be discussed.
First it is quite tied to representation -- the particular encoding
scheme used to store a fact may determine whether it participates in
a similarity connection.
Had we used
@BEGIN[Example]
(ASSOCIATED-WITH MRG STANFORD)
@END[Example]
to indicate MRG's association with Stanford rather than 
@BEGIN[Example]
(ASSOCIATES STANFORD MRG),
@END[Example]
[while RDG retained the 
@BEGIN[Example]
(ASSOCIATES STANFORD RDG)
@END[Example]
form of this type of assertion,]
this fact could not be included in COMMON.

Second, a more serious issue is "how does one use this connection?".
It merely explicates every syntactic connection which might join
A@-{1} to A@-{2}.
Now what?
Knowing that A is similar to B does not tell us anything new
(nor even any plausible conjectures)
about B.
As we will soon see, seeing that A is analogous to B is more useful,
as it can lead to additional conjectures about B.

@SECTION[Analogy - Informal]

For our purposes, an analogy is a relation 
which connects a pair of objects/events/things/...,
(hereafter called "analogues")
and involves other (to be specified) parameters.
The purpose of this paper is a more exact specification of this relation
-- resolving, in particular, what those other arguments to this
Analogy relation are.

Taking a more procedural (@i{i.e.}, less relational) view,
we can ask what is the "analogy" which should then be returned
by our general purpose analogizing program?
And what should be its inputs,
besides (descriptions of) the two analogues?
A "mappist", for example, would claim that the analogues are sufficient input,
and the returned analogy is simply
the mapping function which carries each relevant part/feature of one analogue
into a corresponding part/feature of the other.
We will see later that this is NOT sufficient;
as there is more which must be specified.
(See Note @Ref[Gen-Use].)

The fact that we are talking about two analogues does NOT
restrict us to similarity analogies;
this formalism can readily handle proportional analogies as well.  
The trick here is to convert the proportional analogy,
@i[A:B :: C:D], into a similarity analogy, whose analogues are the terms
@i[REL(A B)] and @i[REL(C D)],
where
@i[REL(@g<a> @g<b>)] denotes a relation which holds between
@g<a> and @g<b>.
We can now talk about this pair of relations, as we could any other object.
(Bravo for second order languages.)

However, as we are dealing with @i{pairs} of analogues,
we are NOT considering familial relations,
which deal with n-tuples of analogues or their features.
(This class includes cases like "all mammals are similar", and
"one can put the jaw bones of all mammals into correspondence",
or even the class of all familial analogies.)
We do, however, feel that a relatively
straightforward modification to
our approach will enable us to deal with such cases.

@SubHeading[Design Constraints]

Our goal is an ANALOGY relation which conforms to our
(well, my)
intuitions.
Hence, it will
be based on the @i{commonality} of the two analogues,
as was the similarity relation defined above.
We will also insist that any two things can be analogous,
where, of course, some analogies
will seem more natural/useful/meaningful than others.
Also, the analogues themselves are usually not enough to uniquely specify
an analogy -- in general there are many possible analogies joining
a pair of objects.
(Again, some of which seem "better" than others).)

@SubHeading[Types of Analogy Tasks]

<<here: merge in:
Consider case of "A is more like B than like C"
(as in "People are more like doors than like windows")
need to determine how B and C different, within constraint that each
is like A>>

@Cite[NaiveAnalogy] described a large number of analogizing tasks,
including "find something similar to this analogue" and
"given two analogues, find the best connecting analogy".
This paper will specify the types of analogizing inferences formally.
The derived formalism will be sufficiently general that it will be able
to encompass any of these.

Broadly stated, there are two basic categories of analogizing tasks.
The goal of the some tasks is to form,
or fill out,
an analogy.
(See Note @Ref[TwoCases].)
This involes finding appropriate values for the as-of-yet unspecified
(or underspecified)
arguments of ANALOGY relation.
The canonical example of this is to deduce
the analogical connection which joins two well-specified analogues.
Another case is finding the second analogue,
given the first analogue and a description of the analogy.

Each member of the other category uses such a "complete" analogy,
in some way
-- e.g. as an inference step, or via some set of (non-meta-level) rules.
The standard case here is deduce (or conjecture) some new fact about
one of the analogues, based on some assertion about the other, and the
realization that they are analogous.
Other tasks deal with one or more derived analogies @i{qua} objects
of study --
for example, selecting the best analogy, within some situation.
(This, of course, requires a measure to evaluate how good each analogy is.)

In standard usage this distinction 
-- of generating a new analogy, versus using a known analogy --
is often blurred.
In one typical situation,
the goal is to derive some (previously unrealized) fact about one analogue,
using a still to be deduced analogy.
Here one has first to generate the analogy, and then use it.

The task of generating the analogy is seldom "pure" either.
One often has some guidance,
to constrain the type of analogy which can be found.
That is, there is usually some additional constraint
which helps to restrict the value of those other parameters
of the ANALOGY relation.
For this reason, @Cite[NaiveAnalogy] used the
@BEGIN[Equation]
"A is like B, within constraint @G[a]."
@END[Equation]
to pose all such analogy problems.
It was within this context that one would conjecture assertions
which might hold about A, given some known fact about B.
Note that that document 
never defined what type of this that constraint, @G[a] was.
Here we will continue to employ this formalism,
and will attempt to define these @G[a] constraints.

@COMMENT{ Everything from this Page 8 upto Page 11 was removed to
	NEWANL.MSS[rdg,dbl] (its appendix) on 24-Nov-82.
	And then the stuff from page 12 thru 23 was transfered to
	NEWANL.MSS[rdg,dbl] as well. }
@BEGIN[Comment]
∂Discussion with MRG 21-Oct-82
(his comments on my paper)
1. Form:
   First: intro
	a motivation -- why is analogy needed?
	this paper describes a formalism, not the USE of an analogy, or ...
   Second -- lead into 5 place predicate, beginning with intuitions
	which need to be honed
   Then a demonstration of the special cases: how mapping and commonality
	
∂Thoughts based on conversation w/DBL, on 25/Oct/82

The system can hold different levels of facts -- 
depth measured wrt causal reasoning.
Analogies can be drawn at any these levels; and then "checked" by
seeing if the encompassing "causal system" can explain this...
This can produce confirmation, or invalidation (i.e. show something
is laughable.)

@END[Comment]
@BEGIN[Comment] 	Outtakes


Hence the underlying, unifying commonality may or may not be obvious
within a particular description.

-----
We went on to note that this type of description could be used to
define the analogy itself.

This skeletal description should be viewed only as a conventient formalism.
What qualifies for that constraint, @G[a]?  
And what does each such type of constraint @i{really} mean?

@Foot[
Note that in many situations
(@i{e.g.}, when the problem situation provides the needed context,)
this contraining clause is implicit in the problem statement,
and does not need to be explicitly mentioned.])

There are several techniques for finding consistent "completions"
(or "closures") of a mapping.
<<here - give some examples>>
These extending techniques, however, need to be guided.
Fortunately, it seems there are, in fact, 

Other tasks involve the use of an analogical connection:
<<here - what other tasks?>>
We mentioned above that directing an analogy is difficult.
Which features should be mapped over?

[need to indicate definitional vs assertional -- but this could be handled
with proper meta-level]

Now can generate new facts, by various solid (i.e. valid) rules
of inference, as are working with a body of axioms.

@END[Comment]
@BEGIN[Comment]			Outtakes - from NewAnl
Outline

Intro/Motivation
   [Why analogy] Analogies used all the time
   [Why paper] No one has formalized what is going on
   [What is paper] NOT a sketch of how to use an analogy;
	instead a description of what it is -- i.e. not code, ...

>2 args
   At least 2 - the analogues
   Must be something else - the context, or "answer"(mapping, theory, ...)

What are analogues
   [Other args are function of this -- e.g. mapping over SOMETHING]
   Cannot be objects themselves, ... rather ... a description of it
   --- leads to reformulation

  END[Outline]

//Below is a proof of why no logical contradiction is possible...

Unfortunately, there is no "rigorous demonstration" of why
this definition is erroneous -- that is, it does not lead to ...
We can, in fact, even prove that no such proof is possible.
<here>
First the intuition:
the analogy may depend on the context or the situation,
which should be considered when deciding whether to accept a given predicate.
In one case one predicate F1 may make sense,
(in that PredicatedOKedBy(A B F1) seems reasonable, and F1(A) can hold),
while in another, F2 seems apt;
the kicker is that F1 and F2 are incompatible, in the sense that they
cannot both hold at once -- i.e.
	F1(B) & F2(B) & (other facts re: B) => False.

Note that this task is not impossible:
these F1 and F2 much simply consider other parameters 
(i.e. the situation, somehow encoding).

-----
I tried here to reach some logical contradiction --
ie trying to "prove" anything silly about B, or to deduce False.
My method was to find some pair A, B which are analogous in two ways.
//Consider RDG and MRG, where each is both 
(i) associated with Stanford CS dept, and
    [hence if one wanted to contact someone in the CS department, one might
    reason that RDG would know him, based on the fact that MRG knew him, and
    RDG was also a co-Stanford CS dep't member.]
(ii) a recorder player.
    [hence if one had a music-related question, and knew that MRG could answer it,
    perhaps RDG could as well, as they both were recorder-ers.]

This means we would want
  PredicatedOKedBy(MRG RDG InSCS)		[3a]
and
  PredicatedOKedBy(MRG RDG Recorder-er)		[3b]
//

The (unattainable) goal here is to somehow show that these two
propositions lead to some contradiction -- either that they, by
themselves, are inconsistent; or that, when using the 2nd order fact (**),
they derive a pair of incompatible compatible solutions.

Case i) Given that we have no axioms about this PredicatedOKedBy relation,
there is no way we could assert that [3a] and [3b] are inconsistent.

[Side note: it is easy to derive that various boolean connectives in the third
argument "distribute" outside --
	PredicatedOKedBy(x y F1@K[And]F2) @K[IFF]
	PredicatedOKedBy(x y F1) @K[And] PredicatedOKedBy(x y F2),
based on [*]; but so what?]

Cae ii) How could we deduce that F1(B) and F2(B) are contradictory, when
they both must be true -- otherwise this whole claim falls down.

<<Before realizing this, I played with

Derive first F1(B), and then F2(B),
which are somehow contradicatory -- i.e. 
	F1(B) & F2(B) & (other facts re: B) => False.
Note that we must have had F1(A) and F2(A) though, to have derived these on B.

I thought that we could have some additional facts about B which lead to this
contradiction. E.g. Suppose 
	F1(x) @K[IFF] [x has >2 elements], and 
	F2(x) @K[IFF] [x has <4 elements],
Then if A has 3 elements, the F1(A) and F2(A).
Now suppose we know that B has an EVEN number of elements.

NO NO NO
There is a model B -- which must have some number of elements...
I loose...
>>>

This means we had to use some other approach.
Perhaps this F must be meta-level, or otherwise complicated.
For example, let's take the Washington is like Lincoln analogy.

Let F1(x) be true if "the (most important) number associated with x is
the cardinality of his presidency".

Thus F1(Washington) => "the relation of Washington to 1 means that Washington
is the first president".
Using [**], and the premises that
Analogous(Washington Lincoln) & F1(Washington)
	& PredicatedOKedBy(Washington Lincoln F1),
we see that F1(Lincoln) -- which means that the answer to the proportional analogy
"Washington : 1 :: Lincoln : ?" must be ?=9.

Another person may assume that F2(Washington) held, where
F2(x) iff "the (most important) number associated with x is
the value of the US Arthur which contain x's likeness".

It seems reasonable to assume that PredicatedOKedBy(Washington Lincoln F2) would
hold as well, which means, given
F2(Washington), that F2(Lincoln).
This contradicts the above result, as
"Washington : 1 :: Lincoln : ?" now means ?=5.

//End of "proof", begun on line 21

it violates one of the basic assumptions we all make about analogies:
namely, that the analogy itself may depend on the context or the situation,
and there is no way to describing this in this formalism.
(Reworded, they should be considered when deciding whether to accept a 
given predicate.)

//Note #2
If we let F1(x) mean we consider only facts about x @i{qua} president,
then F1(Washington) [can be shown to] imply that the (relevant) relation
connecting Washington to 1 is "cardinality of presidency".
Now how can we use the formalism above (i.e. [**]) to derive that F1(Lincoln),
which would, in turn imply that ?=9?
This can be deduced when
PredicatedOKedBy(Washington Lincoln F1), and F1(A);
both of which seem reasonable.

Consider now the F2 predicate, where
F2(x) means the only facts we will consider about x deal with x as part of
a US Arthur.
F2(Washington) would mean that Washington's relation to 1 is in terms of the
fact that Washington's portrait appears on a 1 dollar Arthur.
It does seem reasonable to assert that 
PredicatedOKedBy(Washington Lincoln F2), and F2(A),
which would mean that ?=5, as Lincoln appears on 5 dollar Arthurs.

So which is the answer, ?=9 or ?=5?
In the above notation, it depends on whether F1(B) or F2(B) holds.
That is, [**] does NOT lead to the contradiction F1(B) & F2(B).
But it also does NOT permit us to state that F1 is apt in some situations,
while F2 should be considered under others.
The point here is that we do want to be able to reach either conclusion,
each under its own set of circumstances.

----
*** Tangent ***
Given PredicatedOKedBy(Washington Lincoln F1) and
PredicatedOKedBy(Washington Lincoln F2); now what?
We have, in some sense, too much.
Now whenever F1(Washington) we cannot help but get F1(Lincoln).
- so?

So now that we know that [**] is insufficient, what now?
How can we "spice" it up to handle such cases?
We need to somehow incorporate something IN ADDITION TO THE ANALOGUES THEMSELVES,
somewhere in [**].
The obvious place is the ANALOGOUS relation.
Thinking of Washington and Lincoln as presidents is different from
thinking of them as portraits on various US bills --
these are TWO different analogies, and should be noted as such.

Thus far all we have established is that something else is needed.
But what?

Something about the type of analogy; or the way of considering the analogues,
...
perhaps a mapping.
We'll return to that later.
//End Note #2

@BEGIN[Multiple]
@Tag[TwoCases]
We may regard a logical derivation as
@BEGIN[Example]
@G{p}(A)
@G{s}(B)
@G{r}(A,B)
  ...

ANALOGY(A, B, ...)
  ...

NewFact(B)
@END[Example]
where that
@G{p}(A) refers to a body of facts about one of the analogues, A
(perhaps some conjunct @G{p}@-{1}(A) @K[AND] ... @K[AND] @G{p}@-{N}(A),)
and likewise
@G{s}(B) holds the facts known about the other analogue B.
The remaining @G{r}(A,B) tells some of how A and B are known to interrelate.
(Any of these statements may be empty.)
From these assertions we are able to form a (sub)conclusion that
A and B are analogous, encoded by the
@B[ANALOGOUS(A, B, ...)] statement above.
This, together with possibly other facts enables us to derive a new fact
about A (or about B, or about the connection joining these objects).

"Analogy finding" is the first part of this process, which ends when
@B[ANALOGY(A, B, ...)] statement can be proven.
"Analogy use" begins there, and leads to new statements about the analogues, ...

Note that we are @i{not} committing ourselves at this point on
what types of derivations must take place to reach these 
@B[ANALOGY(A, B, ...)] or @B[NewFact(B)] conclusions.
If the standard logical inference steps (e.g. modus ponens and friends)
are sufficient they alone will be used, based on various rules.
Otherwise we may have to define other types of derivation steps.

As an example of this find versus use dichotomy,
consider
@BEGIN[Example]
(MEM MRG PERSONS) & (NUMBER-ARMS MRG 2)
(MEM RDG PERSONS)

>>Now pretend we have the naive rule:
>>	@K[ForAll] $x, $y, $c. @u[IF] ( (MEM $x $c) & (MEM $y $c) )
>>		@u[THEN] ANALOGY($x $y).

ANALOGY(MRG RDG)

>>Everything above is considered "analogy finding".
>>Now for "analogy use".

>>Here pretend that we had the ridiculous rule:
>>	@K[ForAll] $x, $y, $r. @u[IF] ( ANALOGY($x $y) & ($r $x 2) )
>>		@u[THEN] ($r $y 2).

(NUMBER-ARMS RDG 2)

@END[Example]
@END[Multiple]

@END[Comment]